Single-nucleotide polymorphisms in ialB, gltA and rpoB genes of Bartonella bacilliformis isolated from patients in endemic Peruvian regions

Bartonella bacilliformis is a Gram-negative, aerobic bacterium and the known causal agent of Carrion’s disease, still considered a neglected disease. There is limited information about the nucleotide sequences of this bacterium in international databases, and few studies have addressed the genetic diversity of B. bacilliformis. We analyzed a total of 20 isolates of B. bacilliformis from the Peruvian regions of Ancash and Cajamarca. Three genes (ialB, gltA, and rpoB) were sequenced in each isolate and nucleotide sequences retrieved from GenBank (16 B. bacilliformis genomes) were also included in the study. All this information was merged in order to obtain clearer evidence of the phylogenetic relationships of B. bacilliformis. In the phylogenetic analysis conducted with the concatenated markers, four isolates (B.b-1, B. b-3, B. b- 7, B.b-8) from the Ancash region were observed to form a subgroup different from B. bacilliformis type strain KC583, showing dissimilarity levels of 5.96% (ialB), 3.69% (gltA) and 3.04% (rpoB). Our results suggest that B. bacilliformis consists of two different subgroups. Future investigations are needed to establish the taxonomic status of these subgroups.


Introduction
Bartonella bacilliformis is a facultative intracellular pathogen and the causative agent of Carrion's disease [1].This pathogen is transmitted by sand flies (Lutzomyia spp.); therefore the distribution of the disease is associated with altitudes in which these arthropods are found [2].Therefore, the distribution of this pathogen is restricted to the Andean valleys of Peru and Ecuador, [2,3].While reported in Southern Colombia during the last century, no confirmed case has been reported in this country in the last few decades [4].
Carrion's disease displays an acute phase manifesting with hemolytic anemia (named Oroya fever), which often precedes a chronic phase with skin lesions (named Peruvian Wart) [1].Intense red blood cell destruction occurs in the acute phase, leading to severe hemolytic anemia and temporary immunosuppression [5,6].The warts or eruptive lesions manifested in the chronic phase have several forms of presentation: miliary, mular, or nodular, with bleeding by the wart lesion being the most serious complication during this phase [4,6].
However, the biphasic manifestation is not reported in all patients.Thus, while some patients do not develop the verrucous phase after an acute episode, others begin with a verrucous episode without a previous report of an acute episode [4].For instance, there is a predominance of the acute phase of the disease over the verrucous period in patients from the Monzo ´n Valley in Hua ´nuco and the Sacred Valley of the Incas in Cuzco [7].
The differences between these clinical pictures could be associated with the characteristics inherent to the circulating genotypes; hence this study aimed to determine the possible genetic differences in B. bacilliformis isolated from patients with Carrion's disease in the Peruvian departments of Ancash and Cajamarca.

Reactivation of cryopreserved strains
Twenty B. bacilliformis strains (10 from the Ancash department and 10 from the Cajamarca department) were reactivated in biphasic medium (RPMI Medium1640 with L-glutamine and bicarbonate (Gibco, Paisley, UK), and subcultured in Columbia agar medium (Columbia blood agar base; OXOID, Hampshire, UK) enriched at 10% with sheep blood.All the strains included in the study were isolated from patients for diagnostic purposes and provided by the Laboratorio de Referencia Nacional de Metaxe ´nicas y Zoonosis Bacterianas of the Instituto Nacional de Salud, Peru.Additionally, the strain KC583 (ATCC 35685) was also reactivated to be used as a sequencing reference.The origin and year of the isolates are described in Table 1.

DNA extraction and amplification
Colonies were harvested and DNA was extracted using a commercial kit (GeneJet NGS Cleanup Kit; Thermo Fisher Scientific, Vilnius, Lithuania) following the manufacturer's instructions.We employed previously designed oligonucleotides for the polymerase chain reaction (PCR) test of the three genes (Table 2).
The reactions were carried out in 40 μl of master mix containing 4.5 μL of PCR buffer 10X, 2.5 mM of MgCl 2 , 0.25 mM of dNTPs, 0.25 mM of each primer, 0.75 U/ μL of Platinum Taq DNA Polymerase (Thermo Fisher Scientific, Sao Paulo, Brazil), with at least 2 ng/μL of DNA template.The amplification protocol for ialB and gltA genes proceeded with 40 cycles at 94˚C for 30 sec, 55˚C for 30 sec, and 72˚C for 1 min.rpoB gene amplification was performed by 45 cycles at 95˚C for 30 sec, 51˚C for 30 sec, and 72˚C for 1 min.PCR products were visualized on a 1.6% agarose gel with nucleic acid gel stain (Syber gold; Life Technologies, Oregon, USA).

Sequencing reaction
The PCR products were purified using a commercial kit (GeneJet NGS Cleanup Kit; Thermo Fisher Scientific) according to the specifications of the manufacturer, and the purified products were processed using a commercial kit (BigDye Terminator v3.1 Cycle Sequencing kit; Thermo Fisher Scientific).The sequencing reactions were done using 35 cycles with the following conditions: denaturation at 96˚C for 10 sec, hybridization at 50˚C for 45 sec, and extension at 60˚C for 60 sec.The treated products were purified using in-house precipitation buffer; thus, each sample was treated using 3μL of 3M sodium acetate pH 5.4, 62.5 μL of absolute ethanol, and 14.5 μL of molecular grade water.
After mixing with precipitation buffer, the products were incubated at room temperature for 30 min and thereafter centrifuged at 2000g for 30 min.The pellets were resuspended in 90 μL of 70% ethyl alcohol, and then were centrifuged at 2000g for 15 min.10 μL of deionized formamide was added to the dried pellet, and samples were denatured at 96˚C for 2 min followed by fast cooling on ice.The DNA sequencing was performed using an Applied Biosystems Genetic Analyzer (3500xL, Thermo Fisher Scientific, United States).

Gene data analysis
Sixteen sequences from B. bacilliformis genomes and four sequences from genomes of other Bartonella spp.available in the NCBI GenBank database were included for multiple alignments and the elaboration of the phylogenetic tree (Table 3).In addition, the analysis of gltA included sequences previously reported by Birtles et al. (strain LA6.3) [11] and Lydy et al. (strain Vega) [12].
The quality and assembly of the sequences were determined using the free program Seq Trace (https://bio.tools/seqtrace),which evaluated the reading of each base.To determine the identity of the sequences obtained at the genus and species level, the sequences were BLAST searched against the sequences deposited in the NCBI GenBank database (https://blast.ncbi.nlm.nih.gov/Blast.cgi).The Weblogo server was employed for easy visualization of amino acid multiple sequence alignment, with the overall height of the stack indicating the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino or nucleic acid at that position [13].

Bioinformatic analysis of the ialB region: a gene associated with red blood cell invasion
In the analysis of the ialB gene, four  5; Fig 1).The DNA sequences of these 4 isolates were 100% identical to the ialB sequence of B. bacilliformis Ver097 (NZ_KL503803.1)(identity percentage, 100%).In the multiple-sequence alignment of the remaining 16 strains, 10 from Cajamarca and six from Ancash, no variations were detected in comparison to the ATCC 35685 reference strain (S1 Fig).5, Fig 1B).As occurred with the ialB gene, the variations were identical to the isolate Ver097 and to the two non-genome gltA sequences obtained from Gen-Bank (Accession number: DQ200879.1;Z70021.1).The other data retrieved from the GenBank did not present any nucleotide variation with respect to the reference sequence.

Bioinformatic analysis of the rpoB gene: β subunit of RNA polymerase
In the bioinformatic analysis of the nucleotide sequences of the rpoB gene, 26 single-nucleotide polymorphisms were identified leading to 2 amino acid substitutions (Tables 4 and 5 -8), and as above, the substitutions were identical to those present in the Ver097 strain.The Cond044 strain presented 6 nucleotide variations and only 1 amino acid substitution 755 (Asp-> Asn).Likewise, strain Ver075 presented a single nucleotide variation which would not generate amino acid substitution.The remaining samples and retrieved data did not present any nucleotide variation relative to strain KC583.Changes were identified in the Cond044 sequence at positions 1752, 1806, 1878, 1896, 2109 and 2263 as well as in the Ver075 sequence at position 1786.
To sum up, four Ancash-isolates displayed amino acid substitutions in the three predicted proteins (IalB, GltA and RpoB) in comparison to the reference strain KC583.However, the amino acid substitutions were identical to the amino acids predicted in theVer097 strain (Table 5).

Phylogenetic analysis based on three gene markers
In all cases, phylogenetic analysis of the three analyzed genes presented two main groups.

Discussion
The Bartonella genus currently comprises 38 species with standing in nomenclature and an undefined number of Candidatus species, including uncultured isolates for which only partial DNA sequences have been obtained as well as isolates from which whole genomic data are

Strain
*Sequenced region https://doi.org/10.1371/journal.pntd.0011615.t004 available (https://lpsn.dsmz.de/search?word=bartonella).The DNA fragment most widely used to study the taxonomy of the Bartonella genus is 16S rDNA, but it is not recommended for speciation studies [17].Hence, the gltA and rpoB genes and 16S-23S rDNA ITS were used to classify Bartonella species [18,19].In relation to B. bacilliformis, the detection of the ialB gene has been recommended for identification purposes since it was considered to be an exclusive gene of this bacterium [8].However, further studies showed the presence of the ialB gene in other species of the genus and other genera, and this application was subsequently dismissed [20].In the present study, nucleotide sequences of three genes (ialB, gltA, and rpoB) of 20 isolates of B. bacilliformis from Ancash and Cajamarca were obtained, and 16 sequences of B. bacilliformis were retrieved from GenBank.These data were used to obtain clearer evidence of the phylogenetic relationship of B. bacilliformis.Additionally, this analysis helps to drive future investigations since this study is an initial screening with a cost-effective approach, which will allow the selection of B. bacilliformis strains for further whole genome sequencing.
To the best of knowledge, there are no data related to single-nucleotide polymorphisms in the ialB gene of B. bacilliformis.In the present study, in four of the 20 isolates, 16 synonymous mutations and 15 non-synonymous mutations were detected which resulted in 13 amino acid substitutions (Fig 1A).These amino acid substitutions were identified in four B. bacilliformis strains of Ancash, a Peruvian department in which the main clinical form of Carrions disease is the eruptive or verrucous phase [21].According to Coleman and Minnick, the disruption of the ialB gene of B. bacilliformis resulted in a 47% to 53% decrease in erythrocytes association compared to the wild-type, parental strain [22].Similarly, it has been reported that a inactivated ialB gene from Bartonella tribocorum or Bartonella birtlesii did not generate bacteremia or invasion in rat or mouse erythrocytes in in vivo tests [23,24].Hence, it could be inferred that the samples that present amino acid substitutions might have an altered affinity to red blood and epithelial cells because amino acid substitutions could generate structural and functional changes in the invasion protein, likewise, we assume amino acid substitutions could affect not only infection in the host or reservoir but also the clinical manifestations of the infection by B. bacilliformis.Additionally, according to Coleman et al., environmental changes (transition from sandfly to humans) could influence the expression of the ialB gene [22].Correlation studies determined that the highest expression of the ialB gene was in B. bacilliformis strains treated with temperatures of 20˚C [25].While further studies are necessary, several of the amino acid substitutions observed might have an impact on protein functionality as they result in changes from positively charged to other negatively charged amino acids (or vice versa).
Analysis of the gltA gene showed single-nucleotide mutations in the same four strains that contain mutations in the ialB gene and a degree of dissimilarity of 3.68%, with the sequences * Amino acid positions https://doi.org/10.1371/journal.pntd.0011615.t005 being identical to those of the Ver097, LA6.2 and Vega isolates.This result agrees with that obtained by Birtles et al. who analyzed a partial sequence of gltA in B. bacilliformis; while they showed that this sequence was highly conserved, the gltA sequences of isolates LA6.2, LA6.3 and Luc-Uba presented 3% dissimilarity with respect to other B. bacilliformis [11].Currently, the gltA gene is used to classify species or subgroups of Bartonella spp.Thus, previous studies comparing partial sequences of the gltA gene of Bartonella spp.found 83.8% -93.5% similarity between different Bartonella species and 99.6-99.8%similarity of isolates belonging to the same species [18].To the best of our knowledge, there are no studies related to mutations in the gltA gene of B. bacilliformis and the effect that these could have on GltA catalytic activity.It should be noted that previous studies about multi-locus sequence typing (MLST) studies did not include the ialB and gltA genes, but the present study confirms that the markers chosen have discriminatory power for discriminating between B. bacilliformis subgroups (S2 and S3, Fig 2).
Regarding the rpoB gene (Figs 1C and S1C), three out of four isolates presenting substitutions in the ialB and gltA genes also presented 26 nucleotide variations leading to two non-synonymous mutations.The rpoB gene of the remaining isolate (B.b-8) could not be amplified, the reason could not be determined but might be due to a polymorphism in the primer annealing region which prevents annealing and subsequent amplification.The results of the analysis had a coverage of 100% and a dissimilarity of 3.04% with respect the reference strain KC583.This latter result is in agreement with that obtained by other researchers who have reported a 3.0% dissimilarity in the rpoB gene during sequence comparison in a MLST analysis [26].It is important to point out that amino acid substitutions at RpoB can generate resistance to rifampicin in different microorganisms, including Mycobacterium tuberculosis [27].Regarding B. bacilliformis the presence of the amino acid substitutions at positions 527 (Gln-> Arg), 531 (Ser-> Phe), 540 (His-> Tyr), 545 (Ser-> Phe), and 588 (Ser-> Tyr) have been shown to be involved in the development of resistance to rifampicin [28,29].Nevertheless, amino acid positions 620 and 730 of B. bacilliformis have not been related to the development of rifampicin resistance.Furthermore they are outside the equivalent RpoB regions classically involved in this resistance and therefore probably represent polymorphisms [30].
Analyzing the ITS region, Houpikian et al. reported variations of dissimilarity between three subspecies of B. vinsonii [31] and suggested that these subgroups could belong to a different taxon within the species.In this study, the isolates B. -8 consistently presented dissimilarities >3% in all the genes analyzed when compared with the strain KC583, but a 100% identity with those of strain Ver097.Other studies have shown similar scenarios analyzing other B. bacilliformis genes, as for instance Pap31, with a 10.3% of dissimilarity between KC583 and Ver097 [32] and these differences are also reflected in multi-locus sequence typing analysis (MLST), with the strains Ver097, LA6.3, Luc-Uba as outliers [33].This scenario allows us to suggest that they belong to at least to a B. bacilliformis subspecies, This possibility was previously reported by Paul et al [34] who suggested that isolates Ver097, Luc-Uba and LA6.3 make up a B. bacilliformis subspecies.
Currently, there is no information on the ecological and geographical bases of the divergence in B. bacilliformis, however, we identified that there is a specific subgroup of B. bacilliformis present in the Ancash region.It could therefore be assumed that there is a sub-speciation process in the strains of this endemic zone of Peru.This statement is supported by Paul et al. who described potential sub-speciation in isolates related to Ver097, such as, for instance, LA6.3, and Luc-Uba, all three of which were isolated from Ancash [34].As mentioned, our findings show amino acid mutations that could affects protein functions, so we consider these non-synonymous mutations must be carefully analyzed to avoid inaccuracy during an in-silico analysis [35].
Nevertheless, there is a limited number of isolates, and further studies with a higher number of samples from Ancash and other regions are required for final conclusions to be made, especially if differences in clinical manifestations are identified in patients from Ancash where the eruptive phase is more predominant in comparison to other endemic areas, such as Cajamarca, Cusco and Hua ´nuco [18].A limitation of the present study is that it was based on the analysis of 3 relevant genes, and that larger data sets from whole genome sequencing will result in a more in-depth analysis.Nevertheless, the obtained results clearly highlight the presence of in-depth differences between both groups of B. bacilliformis included in the analysis.

Conclusion
The analysis of the single-nucleotide polymorphisms of the gltA, ialB and rpoB genes is supported by other investigations, providing more evidence of the genotypic diversity of B. bacilliformis circulating in Peru.However, this research needs to be reinforced by complete analysis of whole genome sequencing in a greater number of isolates from different regions of Peru.

Fig 1 .
Fig 1.Sequence logo of the multiple sequence alignment of amino acid sequence for the identification of non-synonymous mutations: A) IalB; B) GltA; C) RpoB.Amino acids are colored according to their chemical properties: polar amino acids are green, basic amino acids are blue, acidic amino acids are red, and hydrophobic amino acids are black.https://doi.org/10.1371/journal.pntd.0011615.g001

Fig 2 .
Fig 2. Concatenated phylogenetic tree based on the ialB, gltA, rpoB genes from 20 strains isolated from Cajamarca and Ancash, and 20 genomes downloaded from GenBank.For the construction of the tree, the NJ method with 1,000 replicates was used.The Bootstrap values are displayed between the branches respectively.The tree was made in the MEGA 7.0 program.https://doi.org/10.1371/journal.pntd.0011615.g002 b.-1, B.b.-3, B.b.-7 and B.b.